Lattice Field Theory on Cluster Computers: Vector- Vs. Cache-Centric Programming

Authors

  • Christoph Best
  • M. Peardon
  • Norbert Eicker
  • P. Ueberholz
  • Thomas Lippert
  • Klaus Schilling

Abstract

We evaluate the possibility of moving medium-sized calculations in lattice field theory from vector supercomputers to cluster computers, namely clusters built from Alpha processors and a Myrinet interconnect, and find that a medium-sized system with a performance of 10 to 20 GFlop/s can be built easily and cost-effectively from current off-the-shelf components. The performance of the algorithms is analyzed with respect to memory bandwidth limitations, both experimentally and with a cache simulator built on C++ operator overloading. It seems that cluster systems, while hampered by poor memory bandwidth compared to supercomputers, might offer opportunities for some algorithms that have good locality but are not vectorizable and thus will not perform well on vector systems.
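
The abstract does not show the simulator itself, so the following is only a minimal sketch of how a cache simulator based on C++ operator overloading might look. All names (CacheSim, TrackedArray, count_misses) are hypothetical, and the direct-mapped organization, the cache geometry (64 lines of 32 bytes), and the use of simulated rather than real addresses are assumptions made for illustration, not details taken from the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical direct-mapped cache model that only counts hits and misses.
class CacheSim {
public:
    CacheSim(std::size_t num_lines, std::size_t line_bytes)
        : line_bytes_(line_bytes),
          tags_(num_lines, ~std::uintptr_t(0)) {  // all-ones marks an empty line
        assert(num_lines > 0 && line_bytes > 1);
    }

    // Record one memory access at a (simulated) byte address.
    void access(std::uintptr_t addr) {
        const std::uintptr_t block = addr / line_bytes_;
        const std::size_t index = block % tags_.size();
        if (tags_[index] == block) {
            ++hits_;
        } else {
            ++misses_;
            tags_[index] = block;  // evict whatever was stored in this line
        }
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::size_t line_bytes_;
    std::vector<std::uintptr_t> tags_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};

// Hypothetical array wrapper: the overloaded operator[] reports every element
// access to the cache model, so numerical code can stay unchanged.
template <typename T>
class TrackedArray {
public:
    TrackedArray(std::size_t n, CacheSim& cache, std::uintptr_t base = 0)
        : data_(n), cache_(&cache), base_(base) {}

    T& operator[](std::size_t i) {
        cache_->access(base_ + i * sizeof(T));  // simulated address of element i
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
    CacheSim* cache_;
    std::uintptr_t base_;
};

// Sum all n elements, visiting them in `stride` interleaved passes, and
// return the number of simulated cache misses (assumed 2 KiB cache).
std::size_t count_misses(std::size_t n, std::size_t stride) {
    CacheSim cache(64, 32);
    TrackedArray<double> a(n, cache);
    double sum = 0.0;
    for (std::size_t s = 0; s < stride; ++s) {
        for (std::size_t i = s; i < n; i += stride) {
            sum += a[i];
        }
    }
    (void)sum;  // the value is irrelevant; only the access pattern matters
    return cache.misses();
}
```

Under these assumptions, calling count_misses(1024, 1) and count_misses(1024, 4) compares a sequential sweep with a strided one over the same 8 KiB array. The strided traversal touches each cache line once per pass and finds it evicted by the next pass, which is the kind of locality effect such a simulator is meant to expose.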

Similar resources

On the Single Processor Performance of Simple Lattice Boltzmann Kernels

This report presents a comprehensive survey of the effect of different data layouts on the single processor performance characteristics for the lattice Boltzmann method both for commodity “off-the-shelf” (COTS) architectures and tailored HPC systems, such as vector computers. We cover modern 64-bit processors ranging from IA32 compatible (Intel Xeon/Nocona, AMD Opteron), superscalar RISC (IBM P...

Cluster-based In-networking Caching for Content-Centric Networking

With the Internet architecture changing from a host-centric communication model to a content-centric model, Content Centric Networking (CCN) has emerged. One distinctive feature of the CCN infrastructure is in-networking caching. As the cache capacities of routers are relatively small compared with the size of the delivered data, one challenge of in-networking caching is how to use the cache resources efficiently. In...

Fast Parallel I/O on Cluster Computers

Today’s cluster computers suffer from slow I/O, which slows down I/O-intensive applications. We show that fast disk I/O can be achieved by operating a parallel file system over fast networks such as Myrinet or Gigabit Ethernet. In this paper, we demonstrate how the ParaStation3 communication system helps speed up parallel I/O on clusters using the open source parallel virtual...

Apple-CORE: Microgrids of SVP cores

To harness the potential of CMPs for scalable, energy-efficient performance in general-purpose computers, the Apple-CORE project has co-designed a general machine model and concurrency control interface with dedicated hardware support for concurrency management across multiple cores. Its SVP interface combines dataflow synchronisation with imperative programming, towards the efficient use of pa...

Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers

This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most usable kernel methods, along with support vector machine classifier, with high accuracy in object recognition. MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...

Publication date: 1999